A Maximum Entropy Model of Phonotactics and Phonotactic Learning
Authors
Abstract
The study of phonotactics (e.g., the ability of English speakers to distinguish possible words like blick from impossible words like *bnick) is a central topic in phonology. We propose a theory of phonotactic grammars and a learning algorithm that constructs such grammars from positive evidence. Our grammars consist of constraints that are assigned numerical weights according to the principle of maximum entropy. Possible words are assessed by these grammars based on the weighted sum of their constraint violations. The learning algorithm is robust against errors in the training data and yields grammars that can capture both categorical and gradient phonotactic patterns. The algorithm is not provided with any constraints in advance, but uses its own resources to form constraints and weight them. A baseline model, in which Universal Grammar is reduced to a feature set and an SPE-style constraint format, suffices to learn many phonotactic phenomena. In order to learn nonlocal phenomena such as stress and vowel harmony, it is necessary to augment the model with autosegmental tiers and metrical grids. Our results thus offer novel, learning-theoretic support for such representations. We apply the model to English syllable onsets, Shona vowel harmony, quantity-insensitive stress typology, and the full phonotactics of Wargamay, showing that the learned grammars capture the distributional generalizations of these languages and accurately predict experimental findings.

* We would like to thank Steven Abney, Jason Eisner, Robert Malouf, Donca Steriade, and audiences at the University of Michigan, University of California at San Diego, and UCLA for helpful input on our project.

Hayes/Wilson, Maximum Entropy Phonotactics
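The abstract's statement that words are assessed by the weighted sum of their constraint violations can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the two constraints (`*bn`, `*bl`), their weights, the use of plain letter strings as segments, and normalization over a small hand-picked candidate set. None of these come from the learned grammars described above.

```python
import math

# Hypothetical constraints: each maps a word to a violation count.
# Words are plain strings; each letter stands in for a segment (assumption).
CONSTRAINTS = {
    "*bn": lambda word: word.count("bn"),
    "*bl": lambda word: word.count("bl"),
}

# Hypothetical weights; in the model these would be learned, not set by hand.
WEIGHTS = {"*bn": 4.0, "*bl": 0.5}


def score(word):
    """Weighted sum of constraint violations (higher means worse)."""
    return sum(WEIGHTS[name] * count(word) for name, count in CONSTRAINTS.items())


def maxent_probs(candidates):
    """Maximum entropy distribution over a finite candidate set:
    P(x) is proportional to exp(-score(x))."""
    unnormalized = {w: math.exp(-score(w)) for w in candidates}
    z = sum(unnormalized.values())  # normalizing constant
    return {w: v / z for w, v in unnormalized.items()}


probs = maxent_probs(["blick", "bnick", "brick"])
for word, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{word}: score={score(word):.1f}  P={p:.3f}")
```

With these illustrative weights, a form that violates the heavily weighted constraint receives a much lower probability than one that violates only a lightly weighted constraint, which is how a single grammar of this kind can express both near-categorical and gradient distinctions.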
Similar resources
A model of rapid phonotactic generalization
The phonotactics of a language describes the ways in which the sounds of the language combine to form possible morphemes and words. Humans can learn phonotactic patterns at the level of abstract classes, generalizing across sounds (e.g., “words can end in a voiced stop”). Moreover, they rapidly acquire these generalizations, even before they acquire sound-specific patterns. We present a probabil...
Phonotactic Knowledge and the Ac
Phonological alternations often serve to modify forms so that they respect a phonotactic restriction that applies across the language. For example, the voicing alternation in the English plural produces word-final sequences that respect the general ban against a voiceless obstruent followed by a voiced one. Since Chomsky and Halle [1], it has been assumed that an adequate theory of phonology sh...
Phonotactics, density, and entropy in spoken word recognition
Previous research has demonstrated that increases in phonotactic probability facilitate spoken word processing, whereas increased competition among lexical representations is often associated with slower and less accurate recognition. We examined the combined effects of probabilistic phonotactics and lexical competition by generating words and nonwords that varied orthogonally on phonotactics a...
The timecourse of phonotactic learning
Speakers show sensitivity to the sound patterns possible in their language (phonotactic patterns). These patterns can involve specific sound sequences (e.g. bb) or more general classes of sequences (e.g. two identical consonants). In some bottom-up models of phonotactic learning, generalizations can only be formed once some of their specific instantiations have been acquired. To test this assu...